Efficient algorithms for computing rank‐revealing factorizations on a GPU

Abstract

Standard rank-revealing factorizations such as the singular value decomposition (SVD) and column pivoted QR factorization are challenging to implement efficiently on a GPU. A major difficulty in this regard is the inability of standard algorithms to cast most operations in terms of the Level-3 BLAS. This article presents two alternative algorithms for computing a rank-revealing factorization of the form $\mathbf{\mathsf{A}}=\mathbf{\mathsf{U}}\mathbf{\mathsf{T}}{\mathbf{\mathsf{V}}}^{\ast}$, where $\mathbf{\mathsf{U}}$ and $\mathbf{\mathsf{V}}$ are orthogonal and $\mathbf{\mathsf{T}}$ is trapezoidal (or triangular if $\mathbf{\mathsf{A}}$ is square). Both algorithms use randomized projection techniques to cast most of the flops in terms of matrix-matrix multiplication, which is exceptionally efficient on the GPU. Numerical experiments illustrate that these algorithms achieve significant acceleration over finely tuned GPU implementations of the SVD while providing low-rank approximation errors close to those of the SVD.
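The idea of obtaining a factorization of the form A = U T V* through randomized projection can be illustrated with a minimal NumPy sketch. It is not taken from the article and is not claimed to be either of its two algorithms. It assumes a real matrix, a Gaussian test matrix, and a hypothetical function name `randomized_utv` with a power-iteration parameter `q`. The dominant costs are the matrix-matrix products and two unpivoted QR factorizations, which is the kind of operation mix that maps well onto Level-3 BLAS.

```python
import numpy as np

def randomized_utv(A, q=1, seed=0):
    """Illustrative randomized factorization A = U @ T @ V.T (real A assumed).

    Hypothetical sketch: V comes from a random projection with q power
    iterations, and U, T come from an unpivoted QR factorization of A @ V.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # Gaussian test matrix (an assumption; other random matrices could be used).
    Y = rng.standard_normal((n, n))
    for _ in range(q):
        # Power iteration: each step is two matrix-matrix products.
        Y = A.T @ (A @ Y)
        # Re-orthonormalize so that small singular directions are not lost to rounding.
        Y, _ = np.linalg.qr(Y)
    # Orthogonal V whose leading columns tend to align with dominant right singular vectors.
    V, _ = np.linalg.qr(Y)
    # Unpivoted QR of A @ V gives orthogonal U and upper trapezoidal T.
    U, T = np.linalg.qr(A @ V, mode="complete")
    return U, T, V
```

A rank-k approximation can then be formed as `U[:, :k] @ T[:k, :] @ V.T` and its error compared with that of the truncated SVD. How close the two are depends on the singular value decay of `A` and on `q`.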

Similar resources

Efficient Data Mining with Evolutionary Algorithms for Cloud Computing Application

With the rapid development of the internet, the amount of information and data being produced is extremely large. Hence, clients are confused by the huge amount of data, and it is difficult to tell which data are useful. Data mining can overcome this problem. When data mining is performed on cloud computing, it reduces processing time, energy usage, and costs. As the speed of ...

Provably Efficient GPU Algorithms

In this paper we present an abstract model for algorithm design on GPUs by extending the parallel external memory (PEM) model with computations in internal memory (commonly known as shared memory in GPU literature) defined in the presence of memory banks and bank conflicts. We also present a framework for designing bank conflict free algorithms on GPUs. Using our framework we develop the first ...

I/O-Efficient Algorithms for Computing Contours on a Terrain

A terrain M is the graph of a bivariate function. We assume that M is represented as a triangulated surface with N vertices. A contour (or isoline) of M is a connected component of a level set of M. Generically, each contour is a closed polygonal curve; at “critical” levels these curves may touch each other or collapse to a point. We present I/O-efficient algorithms for the following two problem...

GPU-Vote: A Framework for Accelerating Voting Algorithms on GPU

Voting algorithms, such as histogram and Hough transforms, are frequently used algorithms in various domains, such as statistics and image processing. Algorithms in these domains may be accelerated using GPUs. Implementing voting algorithms efficiently on a GPU however is far from trivial due to irregularities and unpredictable memory accesses. Existing GPU implementations therefore target only...

Journal

Journal title: Numerical Linear Algebra With Applications

Year: 2023

ISSN: 1070-5325, 1099-1506

DOI: https://doi.org/10.1002/nla.2515